Overview

Dataset statistics

Number of variables: 12
Number of observations: 115
Missing cells: 32
Missing cells (%): 2.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.9 KiB
Average record size in memory: 97.1 B
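
The missing-cell share can be cross-checked from the per-column missing counts listed under Warnings. The following is a minimal sketch in plain Python; the helper name `missing_cell_share` is illustrative and not part of pandas-profiling.

```python
# Figures copied from this report.
n_variables = 12
n_observations = 115

# Per-column missing counts as listed in the report's warnings.
missing_per_column = {
    "transfer_value_gbp": 6,
    "transfer_value_inr": 6,
    "revenue_value_gbp": 6,
    "revenue_value_inr": 6,
    "new_users": 2,
    "users": 2,
    "activer_user_rate": 4,
}


def missing_cell_share(missing_cells: int, rows: int, cols: int) -> float:
    """Return missing cells as a percentage of all cells (rows * cols)."""
    return 100.0 * missing_cells / (rows * cols)


missing_cells = sum(missing_per_column.values())
share = missing_cell_share(missing_cells, n_observations, n_variables)
print(missing_cells, round(share, 1))
```

The total of the per-column counts and the rounded share should agree with the "Missing cells" rows above.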

Variable types

NUM: 10
CAT: 2

Warnings

year_month has a high cardinality: 63 distinct values (High cardinality)
transfer_value_gbp is highly correlated with transfer_value_eur and 6 other fields (High correlation)
transfer_value_eur is highly correlated with transfer_value_gbp and 6 other fields (High correlation)
transfer_value_inr is highly correlated with transfer_value_eur and 6 other fields (High correlation)
revenue_value_eur is highly correlated with transfer_value_eur and 6 other fields (High correlation)
revenue_value_gbp is highly correlated with transfer_value_eur and 6 other fields (High correlation)
revenue_value_inr is highly correlated with transfer_value_eur and 6 other fields (High correlation)
transfers is highly correlated with transfer_value_eur and 6 other fields (High correlation)
new_users is highly correlated with transfer_value_eur and 6 other fields (High correlation)
transfer_value_gbp has 6 (5.2%) missing values (Missing)
transfer_value_inr has 6 (5.2%) missing values (Missing)
revenue_value_gbp has 6 (5.2%) missing values (Missing)
revenue_value_inr has 6 (5.2%) missing values (Missing)
new_users has 2 (1.7%) missing values (Missing)
users has 2 (1.7%) missing values (Missing)
activer_user_rate has 4 (3.5%) missing values (Missing)
year_month is uniformly distributed (Uniform)
transfer_value_eur has unique values (Unique)
revenue_value_eur has unique values (Unique)

Reproduction

Analysis started: 2021-03-14 12:47:39.200397
Analysis finished: 2021-03-14 12:48:22.311143
Duration: 43.11 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

year_month
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 63
Distinct (%): 54.8%
Missing: 0
Missing (%): 0.0%
Memory size: 920.0 B

Value               Count
2015-10             2
2015-06             2
2014-10             2
2016-02             2
2015-11             2
Other values (58)   105

Value               Count  Frequency (%)
2015-10             2      1.7%
2015-06             2      1.7%
2014-10             2      1.7%
2016-02             2      1.7%
2015-11             2      1.7%
2015-01             2      1.7%
2018-01             2      1.7%
2016-09             2      1.7%
2018-05             2      1.7%
2015-07             2      1.7%
Other values (53)   95     82.6%

Frequencies of value counts

Unique

Unique: 11
Unique (%): 9.6%

Histogram of lengths of the category

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

transfer_type
Categorical

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 920.0 B

Value               Count
Personal            63
Business            52

Value               Count  Frequency (%)
Personal            63     54.8%
Business            52     45.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

transfer_value_eur
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 115
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4746857.113
Minimum: 9184
Maximum: 22006935
Zeros: 0
Zeros (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 9184
5th percentile: 116695.4
Q1: 836308
Median: 2821555
Q3: 6415206.5
95th percentile: 15697950.5
Maximum: 22006935
Range: 21997751
Interquartile range (IQR): 5578898.5

Descriptive statistics

Standard deviation: 5220158.764
Coefficient of variation (CV): 1.099708426
Kurtosis: 1.55856614
Mean: 4746857.113
Median Absolute Deviation (MAD): 2452103
Skewness: 1.469592552
Sum: 545888568
Variance: 2.725005752e+13
Monotonicity: Not monotonic
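
Several of the figures in this block follow from the others by simple arithmetic. The sketch below recomputes the range, IQR, coefficient of variation and variance from the values reported for transfer_value_eur. The variable names are illustrative, and small differences in the last digits are to be expected because the inputs are already rounded.

```python
# Values copied from the transfer_value_eur block of this report.
minimum, maximum = 9184, 22006935
q1, q3 = 836308, 6415206.5
mean, std = 4746857.113, 5220158.764

value_range = maximum - minimum  # spread between the two extremes
iqr = q3 - q1                    # spread of the middle 50% of the data
cv = std / mean                  # standard deviation relative to the mean
variance = std ** 2              # square of the standard deviation

print(value_range, iqr, cv, variance)
```

A CV above 1, as here, means the standard deviation exceeds the mean, which is consistent with the right-skewed distribution the skewness value suggests.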
Histogram with fixed size bins (bins=50)
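
A histogram with fixed-size bins splits the interval from the minimum to the maximum into equal-width bins and counts the values that fall in each. The sketch below assumes that equal-width scheme (the report's own plotting code is not shown here) and uses only the five smallest and five largest transfer_value_eur values listed in this report, not the full column. `fixed_width_histogram` is a hypothetical helper.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # All values identical: put everything in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum sits on the right edge, so clamp it into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


# Five smallest and five largest transfer_value_eur values from this report.
sample = [9184, 10841, 21132, 84507, 93305,
          17230797, 18413529, 18967356, 21638933, 22006935]
counts = fixed_width_histogram(sample, bins=50)
print(counts[0], counts[-1], sum(counts))
```

With 50 bins over this range each bin is roughly 440,000 EUR wide, so the five smallest values all land in the first bin.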

Common values

Value               Count  Frequency (%)
5448192             1      0.9%
11596522            1      0.9%
2821555             1      0.9%
10355121            1      0.9%
778672              1      0.9%
251823              1      0.9%
845911              1      0.9%
2375083             1      0.9%
1262250             1      0.9%
18413529            1      0.9%
Other values (105)  105    91.3%

Minimum 5 values

Value               Count  Frequency (%)
9184                1      0.9%
10841               1      0.9%
21132               1      0.9%
84507               1      0.9%
93305               1      0.9%

Maximum 5 values

Value               Count  Frequency (%)
22006935            1      0.9%
21638933            1      0.9%
18967356            1      0.9%
18413529            1      0.9%
17230797            1      0.9%

transfer_value_gbp
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 109
Distinct (%): 100.0%
Missing: 6
Missing (%): 5.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 5846470.091
Minimum: 11636.91085
Maximum: 24529384.83
Zeros: 0
Zeros (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 11636.91085
5th percentile: 228693.2376
Q1: 1243789.251
Median: 3852180.708
Q3: 8398004.041
95th percentile: 18449571.54
Maximum: 24529384.83
Range: 24517747.92
Interquartile range (IQR): 7154214.79

Descriptive statistics

Standard deviation: 5898512.235
Coefficient of variation (CV): 1.008901464
Kurtosis: 1.153747523
Mean: 5846470.091
Median Absolute Deviation (MAD): 2886438.416
Skewness: 1.333987807
Sum: 637265239.9
Variance: 3.479244659e+13
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
8398004.041         1      0.9%
2605233.241         1      0.9%
355162.9976         1      0.9%
1355941.108         1      0.9%
638713.7376         1      0.9%
1082385.664         1      0.9%
478372.4389         1      0.9%
8615640.876         1      0.9%
4077663.199         1      0.9%
11162009.1          1      0.9%
Other values (99)   99     86.1%
(Missing)           6      5.2%

Minimum 5 values

Value               Count  Frequency (%)
11636.91085         1      0.9%
13634.29553         1      0.9%
152348.6922         1      0.9%
154722.6851         1      0.9%
192080.289          1      0.9%

Maximum 5 values

Value               Count  Frequency (%)
24529384.83         1      0.9%
24212700.59         1      0.9%
21156207.46         1      0.9%
20863089.04         1      0.9%
19438138.54         1      0.9%

transfer_value_inr
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 109
Distinct (%): 100.0%
Missing: 6
Missing (%): 5.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 527869724.9
Minimum: 1141891.909
Maximum: 2272727262
Zeros: 0
Zeros (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 1141891.909
5th percentile: 22716200.49
Q1: 111967499
Median: 341325031.9
Q3: 805031078.5
95th percentile: 1678703748
Maximum: 2272727262
Range: 2271585370
Interquartile range (IQR): 693063579.4

Descriptive statistics

Standard deviation: 531685052.9
Coefficient of variation (CV): 1.007227783
Kurtosis: 1.499962486
Mean: 527869724.9
Median Absolute Deviation (MAD): 259244705.3
Skewness: 1.402695249
Sum: 5.753780001e+10
Variance: 2.826889954e+17
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
621282899.3         1      0.9%
997471450.4         1      0.9%
103276278.4         1      0.9%
35204541.61         1      0.9%
229332428.5         1      0.9%
123274516.1         1      0.9%
607535109           1      0.9%
1554987569          1      0.9%
1789913891          1      0.9%
170080765           1      0.9%
Other values (99)   99     86.1%
(Missing)           6      5.2%

Minimum 5 values

Value               Count  Frequency (%)
1141891.909         1      0.9%
1397965.334         1      0.9%
14508713.62         1      0.9%
14601528.86         1      0.9%
18803080.69         1      0.9%

Maximum 5 values

Value               Count  Frequency (%)
2272727262          1      0.9%
2205626360          1      0.9%
1999286345          1      0.9%
1896025951          1      0.9%
1789913891          1      0.9%

revenue_value_eur
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 115
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41420.08889
Minimum: 81.7376
Maximum: 191460.3345
Zeros: 0
Zeros (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 81.7376
5th percentile: 1038.58906
Q1: 7358.5501
Median: 24547.5285
Q3: 56453.99515
95th percentile: 136401.0505
Maximum: 191460.3345
Range: 191378.5969
Interquartile range (IQR): 49095.44505

Descriptive statistics

Standard deviation: 45332.63639
Coefficient of variation (CV): 1.094460142
Kurtosis: 1.538064474
Mean: 41420.08889
Median Absolute Deviation (MAD): 21259.4057
Skewness: 1.461301193
Sum: 4763310.223
Variance: 2055047922
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
10981.575           1      0.9%
131712.0492         1      0.9%
51639.8826          1      0.9%
45488.5497          1      0.9%
63992.7195          1      0.9%
6825.5168           1      0.9%
2554.6026           1      0.9%
100889.7414         1      0.9%
81.7376             1      0.9%
131866.0131         1      0.9%
Other values (105)  105    91.3%

Minimum 5 values

Value               Count  Frequency (%)
81.7376             1      0.9%
96.4849             1      0.9%
190.188             1      0.9%
760.563             1      0.9%
839.745             1      0.9%

Maximum 5 values

Value               Count  Frequency (%)
191460.3345         1      0.9%
188258.7171         1      0.9%
163119.2616         1      0.9%
160197.7023         1      0.9%
149907.9339         1      0.9%

revenue_value_gbp
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 109
Distinct (%): 100.0%
Missing: 6
Missing (%): 5.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 51040.12348
Minimum: 103.5685066
Maximum: 213405.648
Zeros: 0
Zeros (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 103.5685066
5th percentile: 2035.369815
Q1: 10820.96648
Median: 33513.97216
Q3: 74742.23597
95th percentile: 160125.5169
Maximum: 213405.648
Range: 213302.0795
Interquartile range (IQR): 63921.26948

Descriptive statistics

Standard deviation: 51230.728
Coefficient of variation (CV): 1.003734405
Kurtosis: 1.128386431
Mean: 51040.12348
Median Absolute Deviation (MAD): 24918.86576
Skewness: 1.323510669
Sum: 5563373.459
Variance: 2624587491
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
4540.932047         1      0.9%
23279.57247         1      0.9%
128264.1616         1      0.9%
121.3452302         1      0.9%
3160.950679         1      0.9%
10758.29886         1      0.9%
108190.5139         1      0.9%
181943.3842         1      0.9%
7610.672328         1      0.9%
11169.6819          1      0.9%
Other values (99)   99     86.1%
(Missing)           6      5.2%

Minimum 5 values

Value               Count  Frequency (%)
103.5685066         1      0.9%
121.3452302         1      0.9%
1355.903361         1      0.9%
1377.031897         1      0.9%
1709.514572         1      0.9%

Maximum 5 values

Value               Count  Frequency (%)
213405.648          1      0.9%
210650.4951         1      0.9%
181943.3842         1      0.9%
181508.8747         1      0.9%
169111.8053         1      0.9%

revenue_value_inr
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 109
Distinct (%): 100.0%
Missing: 6
Missing (%): 5.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 4609996.117
Minimum: 10162.83799
Maximum: 19772727.18
Zeros: 0
Zeros (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 10162.83799
5th percentile: 202174.1844
Q1: 984720.7318
Median: 2969527.777
Q3: 7003770.383
95th percentile: 14604722.61
Maximum: 19772727.18
Range: 19762564.34
Interquartile range (IQR): 6019049.651

Descriptive statistics

Standard deviation: 4619333.356
Coefficient of variation (CV): 1.002025433
Kurtosis: 1.468881458
Mean: 4609996.117
Median Absolute Deviation (MAD): 2255428.936
Skewness: 1.390854774
Sum: 502489576.7
Variance: 2.133824065e+13
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
5076115.512         1      0.9%
978241.0702         1      0.9%
2969527.777         1      0.9%
15514235.86         1      0.9%
417117.8408         1      0.9%
9697233.566         1      0.9%
5405161.224         1      0.9%
4165300.917         1      0.9%
3365083.998         1      0.9%
1864004.113         1      0.9%
Other values (99)   99     86.1%
(Missing)           6      5.2%

Minimum 5 values

Value               Count  Frequency (%)
10162.83799         1      0.9%
12441.89147         1      0.9%
129127.5512         1      0.9%
129953.6069         1      0.9%
167347.4181         1      0.9%

Maximum 5 values

Value               Count  Frequency (%)
19772727.18         1      0.9%
19188949.33         1      0.9%
17393791.2          1      0.9%
16305823.18         1      0.9%
15514235.86         1      0.9%

transfers
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 114
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3907.626087
Minimum: 2
Maximum: 22664
Zeros: 0
Zeros (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 2
5th percentile: 56.7
Q1: 252.5
Median: 841
Q3: 5584
95th percentile: 16083.3
Maximum: 22664
Range: 22662
Interquartile range (IQR): 5331.5

Descriptive statistics

Standard deviation: 5562.196624
Coefficient of variation (CV): 1.423420896
Kurtosis: 2.016616615
Mean: 3907.626087
Median Absolute Deviation (MAD): 773
Skewness: 1.67725933
Sum: 449377
Variance: 30938031.29
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
61                  2      1.7%
257                 1      0.9%
438                 1      0.9%
5045                1      0.9%
11187               1      0.9%
5554                1      0.9%
7089                1      0.9%
687                 1      0.9%
683                 1      0.9%
8109                1      0.9%
Other values (104)  104    90.4%

Minimum 5 values

Value               Count  Frequency (%)
2                   1      0.9%
3                   1      0.9%
14                  1      0.9%
32                  1      0.9%
33                  1      0.9%

Maximum 5 values

Value               Count  Frequency (%)
22664               1      0.9%
22341               1      0.9%
19628               1      0.9%
19049               1      0.9%
17625               1      0.9%

new_users
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 90
Distinct (%): 79.6%
Missing: 2
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 249.2831858
Minimum: 6
Maximum: 887
Zeros: 0
Zeros (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 6
5th percentile: 11.6
Q1: 26
Median: 81
Q3: 457
95th percentile: 765.2
Maximum: 887
Range: 881
Interquartile range (IQR): 431

Descriptive statistics

Standard deviation: 272.1527102
Coefficient of variation (CV): 1.091741143
Kurtosis: -0.5732599668
Mean: 249.2831858
Median Absolute Deviation (MAD): 72
Skewness: 0.8817225764
Sum: 28169
Variance: 74067.09766
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
16                  4      3.5%
26                  4      3.5%
17                  3      2.6%
33                  2      1.7%
24                  2      1.7%
22                  2      1.7%
566                 2      1.7%
42                  2      1.7%
325                 2      1.7%
46                  2      1.7%
Other values (80)   88     76.5%

Minimum 5 values

Value               Count  Frequency (%)
6                   2      1.7%
8                   1      0.9%
9                   1      0.9%
10                  1      0.9%
11                  1      0.9%

Maximum 5 values

Value               Count  Frequency (%)
887                 1      0.9%
877                 1      0.9%
874                 1      0.9%
865                 1      0.9%
825                 1      0.9%

users
Real number (ℝ≥0)

MISSING

Distinct: 63
Distinct (%): 55.8%
Missing: 2
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 10911.37168
Minimum: 6
Maximum: 28169
Zeros: 0
Zeros (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 6
5th percentile: 236.6
Q1: 3561
Median: 9350
Q3: 16828
95th percentile: 26027
Maximum: 28169
Range: 28163
Interquartile range (IQR): 13267

Descriptive statistics

Standard deviation: 8270.968743
Coefficient of variation (CV): 0.7580136563
Kurtosis: -0.9005422056
Mean: 10911.37168
Median Absolute Deviation (MAD): 6509
Skewness: 0.4929948623
Sum: 1232985
Variance: 68408923.95
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
20746               2      1.7%
13637               2      1.7%
3182                2      1.7%
24730               2      1.7%
14945               2      1.7%
6408                2      1.7%
2381                2      1.7%
7361                2      1.7%
3561                2      1.7%
15619               2      1.7%
Other values (53)   93     80.9%

Minimum 5 values

Value               Count  Frequency (%)
6                   1      0.9%
18                  1      0.9%
44                  1      0.9%
82                  1      0.9%
130                 1      0.9%

Maximum 5 values

Value               Count  Frequency (%)
28169               2      1.7%
27347               2      1.7%
26579               2      1.7%
25659               2      1.7%
24730               2      1.7%

activer_user_rate
Real number (ℝ≥0)

MISSING

Distinct: 111
Distinct (%): 100.0%
Missing: 4
Missing (%): 3.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.84157042
Minimum: 0.8691674291
Maximum: 416.6666667
Zeros: 0
Zeros (%): 0.0%
Memory size: 920.0 B

Quantile statistics

Minimum: 0.8691674291
5th percentile: 1.198918514
Q1: 1.632483811
Median: 30.89187114
Q3: 35.91640718
95th percentile: 89.37060033
Maximum: 416.6666667
Range: 415.7974992
Interquartile range (IQR): 34.28392337

Descriptive statistics

Standard deviation: 49.93757402
Coefficient of variation (CV): 1.619164438
Kurtosis: 34.21563965
Mean: 30.84157042
Median Absolute Deviation (MAD): 29.02841188
Skewness: 5.028845992
Sum: 3423.414316
Variance: 2493.761299
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
46.70018856         1      0.9%
2.244844688         1      0.9%
1.833660773         1      0.9%
89.89361702         1      0.9%
1.64797914          1      0.9%
1.369439868         1      0.9%
1.774583853         1      0.9%
1.21927237          1      0.9%
35.30926423         1      0.9%
1.568813882         1      0.9%
Other values (101)  101    87.8%
(Missing)           4      3.5%

Minimum 5 values

Value               Count  Frequency (%)
0.8691674291        1      0.9%
1.025115325         1      0.9%
1.095197978         1      0.9%
1.156515035         1      0.9%
1.166909773         1      0.9%

Maximum 5 values

Value               Count  Frequency (%)
416.6666667         1      0.9%
233.3333333         1      0.9%
150                 1      0.9%
101.5384615         1      0.9%
93.90243902         1      0.9%

Interactions

[Figures: 100 pairwise interaction plots; images not preserved]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
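
The definition above can be sketched in a few lines of plain Python. This is an illustration rather than the report's own implementation; `pearson_r` is a hypothetical helper, and the inputs are the transfer_value_eur and revenue_value_eur values from the first four sample rows of this report.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of covariance and variances cancel, so sums suffice.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# First four sample rows of this report.
eur = [5343580, 18967356, 4850567, 16960138]
rev = [45954.7880, 163119.2616, 42053.0553, 146982.8044]

r = pearson_r(eur, rev)
# Shifting and positively rescaling one variable should leave r unchanged.
r_rescaled = pearson_r([2.0 * a + 3.0 for a in eur], rev)
print(r, r_rescaled)
```

Because revenue is close to a fixed fraction of transfer value in these rows, r comes out very close to 1, which is in line with the high-correlation warnings in the overview.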

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
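
As a sketch of that recipe, the code below ranks both variables and then applies the Pearson formula to the ranks. The helper names are hypothetical, and giving tied values the average of their positions is an assumption made here; the text above does not say how ties are handled.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# A cubic relationship is nonlinear but perfectly monotonic.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

The cubic example illustrates the point made above: the relationship is not linear, yet the ranks agree perfectly.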

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
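
The pair-counting definition translates directly into code. The sketch below implements exactly the formula stated above, which is usually called tau-a; tied pairs count as neither concordant nor discordant. Libraries often apply a tie-adjusted variant instead, so values may differ on data with ties. `kendall_tau_a` is a hypothetical helper.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# One swapped neighbour: 5 concordant pairs and 1 discordant pair out of 6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```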

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. See the Phik project documentation for details.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
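
A minimal sketch of a bias-corrected Cramér's V follows, using the correction as it is commonly stated for Bergsma's estimator. It is not taken from the report's source code; `cramers_v_corrected` is a hypothetical helper, and it assumes a contingency table with at least two rows and two columns and no empty row or column.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floored at 0.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    # Shrink the table dimensions by the matching correction terms.
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


perfect = cramers_v_corrected([[50, 0], [0, 50]])
independent = cramers_v_corrected([[25, 25], [25, 25]])
print(perfect, independent)
```

In this report the measure would apply to the two categorical variables, year_month and transfer_type.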

Missing values

[Figures: 4 missing-value plots; images not preserved]

Sample

First rows

     year_month  transfer_type  transfer_value_eur  transfer_value_gbp  transfer_value_inr  revenue_value_eur  revenue_value_gbp  revenue_value_inr  transfers  new_users  users    activer_user_rate
0    2018-12     Business       5343580             5.959794e+06        5.340085e+08        45954.7880         51254.230706       4.592473e+06       1076       46.0       28169.0  1.660146
1    2018-12     Personal       18967356            2.115621e+07        1.896026e+09        163119.2616        181943.384193      1.630582e+07       19628      776.0      28169.0  30.891871
2    2018-11     Business       4850567             5.513850e+06        5.129096e+08        42053.0553         47804.915159       4.447388e+06       981        38.0       27347.0  1.670492
3    2018-11     Personal       16960138            1.926924e+07        1.789914e+09        146982.8044        166999.490607      1.551424e+07       17533      730.0      27347.0  30.050040
4    2018-10     Business       5658745             6.411979e+06        6.145070e+08        49231.0815         55784.221350       5.346211e+06       1122       55.0       26579.0  1.769360
5    2018-10     Personal       18413529            2.086309e+07        1.999286e+09        160197.7023        181508.874666      1.739379e+07       19049      865.0      26579.0  32.425270
6    2018-09     Business       5918341             6.620784e+06        6.212829e+08        51489.5667         57600.818972       5.405161e+06       1192       42.0       25659.0  1.953093
7    2018-09     Personal       21638933            2.421270e+07        2.272727e+09        188258.7171        210650.495102      1.977273e+07       22341      887.0      25659.0  35.960372
8    2018-08     Business       5994491             6.686518e+06        6.005697e+08        52152.0717         58172.708498       5.224957e+06       1207       55.0       24730.0  1.911684
9    2018-08     Personal       22006935            2.452938e+07        2.205626e+09        191460.3345        213405.648010      1.918895e+07       22664      874.0      24730.0  35.872442

Last rows

     year_month  transfer_type  transfer_value_eur  transfer_value_gbp  transfer_value_inr  revenue_value_eur  revenue_value_gbp  revenue_value_inr  transfers  new_users  users    activer_user_rate
105  2014-07     Personal       766912              965742.291361       9.894196e+07        6825.5168          8595.106393        880583.404020      549        149.0      636.0    66.324435
106  2014-06     Personal       689455              855131.722216       8.609865e+07        6136.1495          7610.672328        766277.964540      490        129.0      487.0    84.357542
107  2014-05     Personal       521607              638713.737560       6.400720e+07        4642.3023          5684.552264        569664.117578      368        89.0       358.0    88.847584
108  2014-04     Personal       378156              258461.361281       2.625997e+07        3365.5884          2300.306115        233713.770392      271        81.0       269.0    89.893617
109  2014-03     Personal       287034              NaN                 NaN                 2554.6026          NaN                NaN                208        58.0       188.0    101.538462
110  2014-02     Personal       159706              NaN                 NaN                 1421.3834          NaN                NaN                110        48.0       130.0    93.902439
111  2014-01     Personal       158711              NaN                 NaN                 1412.5279          NaN                NaN                114        38.0       82.0     150.000000
112  2013-12     Personal       93305               NaN                 NaN                 839.7450           NaN                NaN                68         26.0       44.0     233.333333
113  2013-11     Personal       84507               NaN                 NaN                 760.5630           NaN                NaN                60         12.0       18.0     416.666667
114  2013-10     Personal       21132               NaN                 NaN                 190.1880           NaN                NaN                14         6.0        6.0      NaN